Abstract



We consider the problem of building robust fuzzy extractors, which allow two parties holding
similar random variables W, W' to agree on a secret key R in the presence of an active adversary.
Robust fuzzy extractors were defined by Dodis et al. in Crypto 2006 to be noninteractive, i.e.,
only one message P, which can be modified by an unbounded adversary, can pass from one party
to the other. This allows them to be used by a single party at different points in time (e.g.,
for key recovery or biometric authentication), but also presents an additional challenge: what
if R is used, and thus possibly observed by the adversary, before the adversary has a chance
to modify P? Fuzzy extractors secure against such a strong attack are called post-application
robust.



We construct a fuzzy extractor with post-application robustness that extracts a shared secret 
key of up to (2m - n)/2 bits (depending on error-tolerance and security parameters), where n
is the bit-length and m is the entropy of W. The previously best known result, also of Dodis et
al., extracted up to (2m - n)/3 bits (depending on the same parameters).

1 Introduction

Consider the following scenario. A user Charlie has a secret w that he wants to use to encrypt and 

authenticate his hard drive. However, w is not a uniformly random key; rather, it is a string with 

some amount of entropy from the point of view of any adversary A. Naturally, Charlie uses an 

extractor [NZ96], which is a tool for converting entropic strings into uniform ones. An extractor

Ext is an algorithm that takes the entropic string w and a uniformly random seed i, and computes 

R = Ext(w; i) that is (almost) uniformly random even given i. 

It may be problematic for Charlie to memorize or store the uniformly random R (this is in 
contrast to w, which can be, for example, a long passphrase already known to Charlie, his biometric, 
or a physical token, such as a physical one-way function [PRTG02]). Rather, in order to decrypt
the hard drive, Charlie can use i again to recompute R = Ext(w; i). The advantage of storing i
rather than R is that i need not be secret, and thus can be written, for example, on an unencrypted
portion of the hard drive. 

Even though the storage of i need not be secret, the authenticity of i is very important. If A 
could modify i to i', then Charlie would extract some related key R', and any guarantee on the 
integrity of the hard drive would vanish, because typical encryption and authentication schemes do 
not provide any security guarantees under related-key attacks. To authenticate i, Charlie would
need to use some secret key, but the only secret he has is w. 



This brings us to the problem of building robust extractors: ones in which the authenticity of the 
seed can be verified at reconstruction time. A robust extractor has two procedures: a randomized 
Gen(w), which generates (R, P) such that R is uniform even given P (think of P as containing the
seed i as well as some authentication information), and Rep(w, P'), which reproduces R if $P' = P$
and outputs $\perp$ with high probability for an adversarially produced $P' \neq P$.

Note that in the above scenario, the adversary A, before attempting to produce $P' \neq P$, gets to
see the value P and how the value R is used for encryption and authentication. Because we want
robust fuzzy extractors to be secure for a wide variety of applications, we do not wish to restrict
how R is used and, therefore, what information about R is available to A. Rather, we will require
that A has low probability of getting Rep(w, P') to not output $\perp$ even if A is given both P and R.
This strong notion of security is known as post-application robustness.

An additional challenge may be that the value w when Gen is run is slightly different from the 
value w' available when Rep is run: for example, the user may make a typo in a long passphrase, 
or a biometric reading may differ slightly. Extractors that can tolerate such differences and still 
reproduce R exactly are called fuzzy [DORS08]. Fuzzy extractors are obtained by adding
error-correcting information to P, to enable Rep to compensate for errors in w'. The specific constructions
depend on the kinds of errors that can occur (e.g., Hamming errors, edit distance errors, etc.). 

Robust (fuzzy) extractors are useful not only in the single-party setting described above, but also 
in interactive settings, where two parties are trying to derive a key from a shared (slightly different 
in the fuzzy case) secret w that either is nonuniform or about which some limited information is 
known to the adversary A. One party, Alice, can run Gen to obtain (R, P) and send P to the 
other party, Bob, who can run Rep to also obtain R. However, if A is actively interfering with 
the channel between Alice and Bob and modifying P, it is important to ensure that Bob detects 
the modification rather than derives a different key R'. Moreover, unless Alice can be sure that
Bob truly received P before she starts using R in a communication, post-application robustness is 
needed. 

Prior Work. Fuzzy extractors, defined in [DORS08], are essentially the noninteractive variant
of privacy amplification and information reconciliation protocols, considered in multiple works,
including [Wyn75, BBR88, Mau93, BBCM95]. Robust (fuzzy) extractors, defined in [BDK+05,
DKRS06], are the noninteractive variant of privacy amplification (and information reconciliation)
secure against active adversaries [Mau97, MW97, Wol98, MW03, RW03, RW04].

Let the length of w be n and the entropy of w be m. Post-application robust fuzzy extractors
cannot extract anything out of w if m < n/2, because an extractor with post-application robustness
implies an information-theoretically secure message authentication code (MAC) with w as the
key (see footnote 1), which is impossible if m < n/2 (see [DS02] for impossibility of deterministic MACs if m <
n/2 and its extension by [Wic08] to randomized MACs). Without any set-up assumptions, the
only previously known post-application robust extractor, due to [DKRS06], extracts R of length
$\frac{2}{3}(m - n/2 - \log\frac{1}{\delta})$ (or even less if R is required to be very close to uniform), where $\delta$ is the
probability that the adversary violates robustness. Making it fuzzy further reduces the length of R
by an amount related to the error-tolerance. (With set-up assumptions, one can do much better:
the construction of [CDF+08] extracts almost the entire entropy m, reduced by an amount related
to security and, in the fuzzy case, to error-tolerance. However, this construction assumes that a
nonsecret uniformly random string is already known to both parties, and that the distribution on



1 The MAC is obtained by extracting R, using it as a key to any standard information-theoretic MAC (e.g., [WC81]),
and sending P along with the tag to the verifier.



w, including adversarial knowledge about w, is independent of this string.) 

Our Results. The robust extractor construction of [DKRS06] is parameterized by a value v that
can be decreased in order to obtain a longer R. In fact, as shown in [DKRS06], a smaller v can be
used for pre-application robustness (a weaker security notion, in which A gets P but not R). We
show in Theorem 2 that the post-application-robustness analysis of [DKRS06] is essentially tight,
and if v is decreased, the construction becomes insecure.

Instead, in Section 3 we propose a new construction of an extractor with post-application
robustness that extracts R of length $m - n/2 - \log\frac{1}{\delta}$, improving the previous result by a factor of
3/2 (more if R is required to be very close to uniform). While this is only a constant-factor increase,
in scenarios where secret randomness is scarce it can make a crucial difference. Like [DKRS06], we
make no additional set-up assumptions. Computationally, our construction is slightly more efficient
than the construction of [DKRS06]. Our improved robust extractor translates into an improved
robust fuzzy extractor using the techniques of [DKRS06], with the same factor of 3/2 improvement.

In addition, we show (in Section 3.2) a slight improvement for the pre-application robust version
of the extractor of [DKRS06], applicable when the extracted string must be particularly close to
uniform.

2 Preliminaries 

Notation. For binary strings a, b, $a\|b$ denotes their concatenation and $|a|$ denotes the length of a. For
a binary string a and indices $i \le j$, we denote by $[a]_i^j$ the substring $b = a_i a_{i+1} \ldots a_j$. If S is a set, $x \leftarrow S$ means
that x is chosen uniformly from S. If X is a probability distribution (or a random variable), then
$x \leftarrow X$ means that x is chosen according to distribution X. If X and Y are two random variables,
then $X \times Y$ denotes the product distribution (obtained by sampling X and Y independently). All
logarithms are base 2.

Random Variables, Entropy, Extractors. Let $U_\ell$ denote the uniform distribution on $\{0,1\}^\ell$.
Let $X_1, X_2$ be two probability distributions over some set S. Their statistical distance is

$$\mathbf{SD}(X_1, X_2) \stackrel{\mathrm{def}}{=} \max_{T \subseteq S} \{\Pr[X_1 \in T] - \Pr[X_2 \in T]\} = \frac{1}{2} \sum_{s \in S} \left| \Pr_{X_1}[s] - \Pr_{X_2}[s] \right|$$

(they are said to be $\varepsilon$-close if $\mathbf{SD}(X_1, X_2) \le \varepsilon$). We will use the following lemma on statistical
distance that was proven in [DKRS08]:

Lemma 1. For any joint distribution (A, B) and distributions C and D over the ranges of A and
B respectively, if $\mathbf{SD}((A, B), C \times D) \le \alpha$, then $\mathbf{SD}((A, B), C \times B) \le 2\alpha$.
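As a small illustration of the two equivalent forms of statistical distance defined above, the following Python sketch computes both for a pair of toy distributions. The function names and the example distributions are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction

def statistical_distance(p, q):
    """Half the L1 distance between two distributions (dicts: outcome -> probability)."""
    support = set(p) | set(q)
    return sum(abs(p.get(s, 0) - q.get(s, 0)) for s in support) / 2

def max_event_gap(p, q):
    """max over events T of Pr_p[T] - Pr_q[T]; the maximum is attained by T = {s : p(s) > q(s)}."""
    support = set(p) | set(q)
    return sum(max(p.get(s, 0) - q.get(s, 0), 0) for s in support)

# Toy example (hypothetical distributions over a 4-element set).
uniform = {s: Fraction(1, 4) for s in range(4)}
biased = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}

print(statistical_distance(biased, uniform))  # 1/4
print(max_event_gap(biased, uniform))         # 1/4
```

Both forms agree on this example, as the definition requires; the biased distribution is therefore 1/4-close to uniform.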

Min-Entropy. The min-entropy of a random variable W is $H_\infty(W) = -\log(\max_w \Pr[W = w])$
(all logarithms are base 2, unless specified otherwise). Following [DORS08], for a joint distribution
(W, E), define the (average) conditional min-entropy of W given E as

$$\tilde{H}_\infty(W \mid E) = -\log\left( \mathbb{E}_{e \leftarrow E}\left( 2^{-H_\infty(W \mid E = e)} \right) \right)$$

(here the expectation is taken over e for which Pr[E = e] is nonzero). A computationally unbounded
adversary who receives the value of E cannot find the correct value of W with probability greater
than $2^{-\tilde{H}_\infty(W \mid E)}$. We will use the following lemma from [DORS08]:



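Lemma 2. Let A, B, C be random variables. If B has at most $2^\lambda$ possible values, then the
following holds: $\tilde{H}_\infty(A \mid B, C) \ge \tilde{H}_\infty((A, B) \mid C) - \lambda \ge \tilde{H}_\infty(A \mid C) - \lambda$. In particular,
$\tilde{H}_\infty(A \mid B) \ge H_\infty((A, B)) - \lambda \ge H_\infty(A) - \lambda$.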
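To make the definitions of min-entropy and average conditional min-entropy concrete, here is a minimal Python sketch. It uses the identity that the expectation in the definition equals the sum over e of the largest joint probability Pr[W = w, E = e]. The function names and the toy joint distribution (W uniform on 3-bit strings, E its lowest bit) are assumptions chosen for illustration.

```python
import math
from collections import defaultdict

def min_entropy(dist):
    """H_inf(W) for a distribution given as dict: outcome -> probability."""
    return -math.log2(max(dist.values()))

def avg_cond_min_entropy(joint):
    """Average conditional min-entropy of W given E.

    joint maps (w, e) -> Pr[W = w, E = e]. The expectation over e of
    2^{-H_inf(W | E = e)} equals sum_e max_w Pr[W = w, E = e].
    """
    best = defaultdict(float)
    for (w, e), p in joint.items():
        best[e] = max(best[e], p)
    return -math.log2(sum(best.values()))

# Toy example: W uniform over {0,...,7}; E is the lowest bit of W (2 = 2^1 values).
marginal = {w: 1 / 8 for w in range(8)}
joint = {(w, w & 1): 1 / 8 for w in range(8)}

print(min_entropy(marginal))        # 3.0
print(avg_cond_min_entropy(joint))  # 2.0
```

Here E takes $2^1$ values and the conditional min-entropy drops by exactly one bit, so the bound of Lemma 2 is met with equality in this example.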

Because in this paper the adversary is sometimes assumed to have some external information 
E about Alice and Bob's secrets, we need the following variant, defined in [DORS08, Definition 2],
of the definition of strong extractors of [NZ96]:

Definition 1. Let $\mathsf{Ext} : \{0,1\}^n \to \{0,1\}^\ell$ be a polynomial time probabilistic function that uses r
bits of randomness. We say that Ext is an average-case $(n, m, \ell, \varepsilon)$-strong extractor if for all pairs
of random variables (W, E) such that $w \in W$ is an n-bit string and $\tilde{H}_\infty(W \mid E) \ge m$, we have
$\mathbf{SD}((\mathsf{Ext}(W; X), X, E), (U_\ell, X, E)) \le \varepsilon$, where X is the uniform distribution over $\{0,1\}^r$.

Any strong extractor can be made average-case with a slight increase in input entropy [DORS08,
Section 2.5]. We should note that some strong extractors, such as universal hash functions [CW79,
HILL99] discussed next, generalize without any loss to average-case.

The Leftover Hash Lemma. We first recall the notion of universal hashing [CW79]:

Definition 2. A family of efficient functions $\mathcal{H} = \{h_i : \{0,1\}^n \to \{0,1\}^\ell\}_{i \in I}$ is universal if for all
distinct x, x' we have $\Pr_{i \leftarrow I}[h_i(x) = h_i(x')] \le 2^{-\ell}$.

$\mathcal{H}$ is pairwise independent if for all distinct x, x' and all y, y' it holds that
$\Pr_{i \leftarrow I}[h_i(x) = y \wedge h_i(x') = y'] \le 2^{-2\ell}$.

Lemma 3 (Leftover Hash Lemma, average-case version [DORS08]). For $\ell, m, \varepsilon > 0$, $\mathcal{H}$ is a strong
$(m, \varepsilon)$ average-case extractor (where the index of the hash function is the seed to the extractor) if
$\mathcal{H}$ is universal and $\ell \le m + 2 - 2\log\frac{1}{\varepsilon}$.

This Lemma easily generalizes to the case when $\mathcal{H}$ is allowed to depend on the extra information
E about the input X. In other words, every function in $\mathcal{H}$ takes an additional input e, and the
family $\mathcal{H}$ is universal for every fixed value of e.

Secure Sketches and Fuzzy Extractors. We start by reviewing the definitions of secure
sketches and fuzzy extractors from [DORS08]. Let $\mathcal{M}$ be a metric space with distance function
dis (we will generally denote by n the length of each element in $\mathcal{M}$). Informally, a secure sketch
enables recovery of a string $w \in \mathcal{M}$ from any "close" string $w' \in \mathcal{M}$ without leaking too much
information about w.

Definition 3. An $(m, \tilde{m}, t)$-secure sketch is a pair of efficient randomized procedures (SS, SRec)
s.t.:

1. The sketching procedure SS on input $w \in \mathcal{M}$ returns a bit string $s \in \{0,1\}^*$. The recovery
procedure SRec takes an element $w' \in \mathcal{M}$ and $s \in \{0,1\}^*$.

2. Correctness: If $\mathsf{dis}(w, w') \le t$ then SRec(w', SS(w)) = w.

3. Security: For any distribution W over $\mathcal{M}$ with min-entropy m, the (average) min-entropy
of W conditioned on s does not decrease very much. Specifically, if $H_\infty(W) \ge m$ then
$\tilde{H}_\infty(W \mid \mathsf{SS}(W)) \ge \tilde{m}$.

The quantity $m - \tilde{m}$ is called the entropy loss of the secure sketch.



In this paper, we will construct a robust fuzzy extractor for the binary Hamming metric using
secure sketches for the same metric. We will briefly review the syndrome construction from [DORS08,
Construction 3] that we use (see also references therein for its previous incarnations). Consider an
efficiently decodable $[n, n - k, 2t + 1]$ linear error-correcting code C. The sketch s = SS(w) consists
of the k-bit syndrome of w with respect to C. We will use the fact that s is a (deterministic) linear
function of w and that the entropy loss is at most |s| = k bits in the construction of our robust
fuzzy extractor for the Hamming metric.
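The syndrome construction can be sketched in a few lines of Python for a concrete code. The example below assumes the [7, 4, 3] Hamming code (so n = 7, k = 3, t = 1), whose parity-check matrix has the binary representation of j as its j-th column; this choice of code, and the function names, are illustrative assumptions rather than anything fixed by the paper.

```python
from itertools import product

def syndrome(w):
    """Syndrome of a 7-bit list under the [7,4,3] Hamming code:
    the XOR of the 1-based positions of the set bits."""
    s = 0
    for pos, bit in enumerate(w, start=1):
        if bit:
            s ^= pos
    return s

def ss(w):
    """Sketch: the k = 3 bit syndrome, a linear function of w."""
    return syndrome(w)

def srec(w_prime, s):
    """Recover w from w' within Hamming distance 1, given s = SS(w).
    syndrome(w') XOR s is the syndrome of the error, i.e. its position (0 if none)."""
    err = syndrome(w_prime) ^ s
    w = list(w_prime)
    if err:
        w[err - 1] ^= 1
    return w

def flip(w, j):
    """Return a copy of w with bit j (0-based) flipped."""
    w2 = list(w)
    w2[j] ^= 1
    return w2

def check_all():
    """Exhaustively check correctness for every w and every error of weight at most 1."""
    for w in product((0, 1), repeat=7):
        w = list(w)
        if srec(w, ss(w)) != w:
            return False
        if any(srec(flip(w, j), ss(w)) != w for j in range(7)):
            return False
    return True

print(check_all())  # True
```

Because the sketch is linear, syndrome(w') XOR syndrome(w) depends only on the error pattern, which is the property the robust construction relies on later.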

We note that, as was shown in [DKRS06], the secure sketch construction for the set difference
metric of [DORS08] can be used to extend the robust fuzzy extractor construction in the Hamming
metric to the set difference metric.

While a secure sketch enables recovery of a string w from a close string w', a fuzzy extractor 
extracts a close-to-uniform string R and allows the precise reconstruction of R from any string w' 
close to w. 

Definition 4. An $(m, \ell, t, \varepsilon)$-fuzzy extractor is a pair of efficient randomized procedures (Gen, Rep)
with the following properties:

1. The generation procedure Gen, on input $w \in \mathcal{M}$, outputs an extracted string $R \in \{0,1\}^\ell$ and
a helper string $P \in \{0,1\}^*$. The reproduction procedure Rep takes an element $w' \in \mathcal{M}$ and
a string $P \in \{0,1\}^*$ as inputs.

2. Correctness: If $\mathsf{dis}(w, w') \le t$ and $(R, P) \leftarrow \mathsf{Gen}(w)$, then Rep(w', P) = R.

3. Security: For any distribution W over $\mathcal{M}$ with min-entropy m, the string R is close to uniform
even conditioned on the value of P. Formally, if $H_\infty(W) \ge m$ and $(R, P) \leftarrow \mathsf{Gen}(W)$, then
we have $\mathbf{SD}((R, P), U_\ell \times P) \le \varepsilon$.

Note that fuzzy extractors allow the information P to be revealed to an adversary without 
compromising the security of the extracted random string R. However, they provide no guarantee 
when the adversary is active. Robust fuzzy extractors defined (and constructed) in [DKRS06] 
formalize the notion of security against active adversaries. We review the definition below. 

If W, W' are two (correlated) random variables over a metric space $\mathcal{M}$, we say $\mathsf{dis}(W, W') \le t$
if the distance between W and W' is at most t with probability one. We call (W, W') a (t, m)-pair
if $\mathsf{dis}(W, W') \le t$ and $H_\infty(W) \ge m$.

Definition 5. An $(m, \ell, t, \varepsilon)$-fuzzy extractor has post-application (resp., pre-application) robustness
$\delta$ if for all (t, m)-pairs (W, W') and all adversaries A, the probability that the following experiment
outputs "success" is at most $\delta$: sample (w, w') from (W, W'); let (R, P) = Gen(w); let $\tilde{P} = A(R, P)$
(resp., $\tilde{P} = A(P)$); output "success" if $\tilde{P} \neq P$ and $\mathsf{Rep}(w', \tilde{P}) \neq \perp$.

We note that the above definitions can be easily extended to give average-case fuzzy extractors 
(where the adversary has some external information E correlated with W), and that our construc- 
tions satisfy those stronger definitions, as well. 

3 The New Robust Extractor 

In this section we present our new extractor with post-application robustness. We extend it to 
a robust fuzzy extractor in Section 5. Our approach is similar to that of [DKRS06]; a detailed
comparison is given in Section 4.



Starting point: key agreement secure against a passive adversary. Recall that a strong 
extractor allows extraction of a string that appears uniform to an adversary even given the presence 
of the seed used for extraction. Therefore, a natural way of achieving key agreement in the errorless 
case is for Alice to pick a random seed i for a strong extractor and send it to Bob (in the clear). 
They could then use R = Ext(w;i) as the shared key. As long as the adversary is passive, the 
shared key looks uniform to her. However, such a protocol can be rendered completely insecure 
when executed in the presence of an active adversary because A could adversarially modify i to 
i' such that R' extracted by Bob has no entropy. To prevent such malicious modification of i we 
will require Alice to send an authentication of i (along with i) to Bob. In our construction, we 
authenticate i using w as the key and then extract from w using i as the seed. Details follow. 

Construction. For the rest of the paper we will let $w \in \{0,1\}^n$. We will assume that n is
even (if not, drop one bit of w, reducing its entropy by at most 1). To compute Gen(w), let a
be the first half of w and b the second: $a = [w]_1^{n/2}$, $b = [w]_{n/2+1}^{n}$. View a, b as elements of $\mathbb{F}_{2^{n/2}}$.
Let $v = n - m + \log\frac{1}{\delta}$, where $\delta$ is the desired robustness. Choose a random $i \in \mathbb{F}_{2^{n/2}}$. Compute
y = ia + b. Let $\sigma$ consist of the first v bits of y and the extracted key R consist of the rest of y:
$\sigma = [y]_1^v$, $R = [y]_{v+1}^{n/2}$. Output $P = (i, \sigma)$.

Gen(w):

1. Let $a = [w]_1^{n/2}$, $b = [w]_{n/2+1}^{n}$

2. Select a random $i \leftarrow \mathbb{F}_{2^{n/2}}$

3. Set $\sigma = [ia + b]_1^v$, $R = [ia + b]_{v+1}^{n/2}$ and output $P = (i, \sigma)$

Rep(w, P' = (i', $\sigma'$)):

1. Let $a = [w]_1^{n/2}$, $b = [w]_{n/2+1}^{n}$

2. If $\sigma' = [i'a + b]_1^v$ then compute $R' = [i'a + b]_{v+1}^{n/2}$ else output $\perp$
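The Gen and Rep procedures above can be sketched in Python for a toy instance. The sketch assumes n = 16 (so the field is GF(2^8)), represents the field modulo the irreducible polynomial x^8 + x^4 + x^3 + x + 1, and interprets "the first v bits" as the most significant bits of an integer; all three choices, and the function names, are assumptions made for illustration. Parameters this small provide no real security.

```python
import secrets

HALF = 8         # n/2, so n = 16 in this toy instance
MODULUS = 0x11B  # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2) (assumed representation)

def gf_mul(x, y):
    """Multiply two elements of GF(2^8) = F_2[x]/(MODULUS)."""
    result = 0
    while y:
        if y & 1:
            result ^= x
        x <<= 1
        if x >> HALF:
            x ^= MODULUS
        y >>= 1
    return result

def split(y, v):
    """Return (first v bits, remaining HALF - v bits) of the HALF-bit value y."""
    return y >> (HALF - v), y & ((1 << (HALF - v)) - 1)

def gen(w, v):
    """Gen(w): returns (R, P) with P = (i, sigma)."""
    a, b = w >> HALF, w & ((1 << HALF) - 1)
    i = secrets.randbelow(1 << HALF)          # random seed i in GF(2^8)
    sigma, R = split(gf_mul(i, a) ^ b, v)     # field addition is XOR
    return R, (i, sigma)

def rep(w, P, v):
    """Rep(w, P'): returns R' if the tag verifies, else None (standing in for the reject symbol)."""
    i, sigma = P
    a, b = w >> HALF, w & ((1 << HALF) - 1)
    expected_sigma, R = split(gf_mul(i, a) ^ b, v)
    return R if expected_sigma == sigma else None

w = 0xBEEF
R, P = gen(w, 5)
print(rep(w, P, 5) == R)              # True: an unmodified P reproduces R
print(rep(w, (P[0], P[1] ^ 1), 5))    # None: a modified tag is rejected
```

Note that the same value ia + b supplies both the tag and the key, which is the design choice that distinguishes this construction from that of [DKRS06].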

Theorem 1. Let $\mathcal{M} = \{0,1\}^n$. Setting $v = n/2 - \ell$, the above construction is an $(m, \ell, 0, \varepsilon)$-fuzzy
extractor with robustness $\delta$, for any $m, \ell, \varepsilon, \delta$ satisfying $\ell \le m - n/2 - \log\frac{1}{\delta}$ as long as
$m \ge n/2 + 2\log\frac{1}{\varepsilon}$.

If $\varepsilon$ is so low that the constraint $m \ge n/2 + 2\log\frac{1}{\varepsilon}$ is not satisfied, then the construction can
be modified as shown in Section 3.1.

Proof. Extraction. Our goal is to show that R is nearly uniform given P. To do so, we first show
that the function $h_i(a, b) = (\sigma, R)$ is a universal hash family. Indeed, for $(a, b) \neq (a', b')$ consider

$$\Pr_i[h_i(a, b) = h_i(a', b')] = \Pr_i[ia + b = ia' + b'] = \Pr_i[i(a - a') = (b - b')] \le 2^{-n/2}.$$

To see the last inequality recall that $(a, b) \neq (a', b')$. Therefore, if $a = a'$, then $b \neq b'$, making
$\Pr_i[i(a - a') = (b - b')] = 0$. If $a \neq a'$, then there is a unique $i = (b - b')/(a - a')$ that satisfies the
equality (signs are immaterial in characteristic 2). Since i is chosen randomly from $\mathbb{F}_{2^{n/2}}$, the probability of the specific i occurring is $2^{-n/2}$.
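The universality argument can be checked exhaustively on a tiny field. The sketch below assumes GF(2^4) represented modulo x^4 + x + 1 (so n/2 = 4) and, following the proof, works with the differences a - a' and b - b' rather than with the pairs themselves; the field size and names are illustrative assumptions.

```python
MOD4 = 0b10011  # x^4 + x + 1, irreducible over GF(2) (assumed representation)

def gf16_mul(x, y):
    """Multiply two elements of GF(2^4) = F_2[x]/(MOD4)."""
    r = 0
    while y:
        if y & 1:
            r ^= x
        x <<= 1
        if x & 0x10:
            x ^= MOD4
        y >>= 1
    return r

def max_collisions():
    """For every nonzero difference (da, db), count the seeds i with i*da == db.

    By linearity, h_i(a, b) = h_i(a', b') exactly when i*(a - a') = b - b',
    so this covers all pairs of distinct inputs.
    """
    worst = 0
    for da in range(16):
        for db in range(16):
            if (da, db) == (0, 0):
                continue
            hits = sum(1 for i in range(16) if gf16_mul(i, da) == db)
            worst = max(worst, hits)
    return worst

print(max_collisions())  # 1, i.e. collision probability at most 1/16 = 2^{-n/2}
```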



Because $|(R, \sigma)| = n/2$, Lemma 3 gives us $\mathbf{SD}((R, P), U_{|R|} \times U_{|P|}) \le \varepsilon/2$ as long as
$n/2 \le m + 2 - 2\log\frac{2}{\varepsilon}$, or, equivalently, (R, P) is $2^{(n/2 - m)/2 - 1}$-close to $U_{|R|} \times U_{|P|}$. Applying Lemma 1
to $A = R$, $B = P$, $C = U_{n/2 - v}$, $D = U_{n/2} \times U_v$, we get that (R, P) is $\varepsilon$-close to $U_{n/2 - v} \times P$, for
$\varepsilon = 2^{(n/2 - m)/2}$. From here it follows that for extraction to be possible, $m \ge n/2 + 2\log\frac{1}{\varepsilon}$.

Post-Application Robustness. In the post-application robustness security game, the adversary
A on receiving $(P = (i, \sigma), R)$ (generated according to procedure Gen) outputs $P' = (i', \sigma')$, and is
considered successful if $(P' \neq P) \wedge [i'a + b]_1^v = \sigma'$. In our analysis, we will assume that $i' \neq i$. We
claim that this does not reduce A's success probability. Indeed, if $i' = i$ then, for $P' \neq P$ to hold,
A would have to output $\sigma' \neq \sigma$. However, when $i' = i$, Rep would output $\perp$ unless $\sigma' = \sigma$.

In our analysis, we allow A to be deterministic. This is without loss of generality since we allow 
an unbounded adversary. We also allow A to arbitrarily fix i. This makes the result only stronger 
since we demonstrate robustness for a worst-case choice of i. 

Since i is fixed and A is deterministic, $(\sigma, R)$ determines the transcript $tr = (i, \sigma, R, i', \sigma')$. For
any particular tr, let $\mathsf{Succ}_{tr}$ be the event that the transcript is tr and A wins, i.e., that
$ia + b = \sigma\|R \wedge [i'a + b]_1^v = \sigma'$. We denote by $\mathsf{Bad}_{tr}$ the set of $w = a\|b$ that make $\mathsf{Succ}_{tr}$ true. For any tr,
$\Pr_w[\mathsf{Succ}_{tr}] \le |\mathsf{Bad}_{tr}| 2^{-m}$, because each w in $\mathsf{Bad}_{tr}$ occurs with probability at most $2^{-m}$. We now
partition the set $\mathsf{Bad}_{tr}$ into $2^\ell$ disjoint sets, indexed by $R' \in \{0,1\}^\ell$:

$$\mathsf{Bad}_{tr}^{R'} = \{w \mid w \in \mathsf{Bad}_{tr} \wedge [i'a + b]_{v+1}^{n/2} = R'\} = \{w \mid (ia + b = \sigma\|R) \wedge (i'a + b = \sigma'\|R')\}$$

For a particular value of (tr, R'), $w = a\|b$ is uniquely determined by the constraints that define
the above set. Therefore, $|\mathsf{Bad}_{tr}^{R'}| = 1$. Since $\mathsf{Bad}_{tr} = \bigcup_{R' \in \{0,1\}^\ell} \mathsf{Bad}_{tr}^{R'}$, we get $|\mathsf{Bad}_{tr}| \le 2^\ell = 2^{n/2 - v}$.
From here it follows that

$$\Pr[\mathsf{Succ}_{tr}] \le |\mathsf{Bad}_{tr}| 2^{-m} \le 2^{n/2 - v - m}.$$

$\Pr[\mathsf{Succ}_{tr}]$ measures the probability that the transcript is tr and A succeeds. To find out the
probability that A succeeds, we need to simply add $\Pr[\mathsf{Succ}_{tr}]$ over all possible tr. Since a transcript
is completely determined by $\sigma, R$, the total number of possible transcripts is $2^{|\sigma| + |R|} = 2^{n/2}$ and,
therefore, A's probability of success is at most $2^{n - v - m}$.

To achieve $\delta$-robustness, we need to set v to at least $n - m + \log\frac{1}{\delta}$. From here it follows that
$\ell = \frac{n}{2} - v \le \frac{1}{2}(2m - n - 2\log\frac{1}{\delta})$. $\square$
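The parameter choices in Theorem 1 and its proof can be turned into a short calculation. The following Python sketch computes the tag length v, the key length, and whether the uniformity condition of the theorem holds; the function name and the sample numbers (n = 512, m = 400, robustness 2^-32, uniformity 2^-64) are hypothetical and chosen only to illustrate the arithmetic.

```python
import math

def our_parameters(n, m, delta, eps):
    """Parameters of the construction as stated in Theorem 1 (illustrative helper).

    Returns (v, ell, uniformity_ok) where
      v   = n - m + log(1/delta)    (tag length needed for delta-robustness)
      ell = n/2 - v                 (extracted key length, = m - n/2 - log(1/delta))
      uniformity_ok is True when m >= n/2 + 2*log(1/eps).
    """
    log_d = math.log2(1 / delta)
    log_e = math.log2(1 / eps)
    v = n - m + log_d
    ell = n / 2 - v
    uniformity_ok = m >= n / 2 + 2 * log_e
    return v, ell, uniformity_ok

print(our_parameters(512, 400, 2**-32, 2**-64))  # (144.0, 112.0, True)
```

With these sample numbers the tag takes 144 of the 256 bits of ia + b and the remaining 112 bits form the key.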

3.1 Getting Closer to Uniform 

If $\varepsilon$ is so low that the constraint $m \ge n/2 + 2\log\frac{1}{\varepsilon}$ is not satisfied, then in our construction we can
simply shorten R by $\beta = n/2 + 2\log\frac{1}{\varepsilon} - m$ bits, as follows: keep $v = n - m + \log\frac{1}{\delta}$ (regardless of $\ell$),
and let $R = [ia + b]_{v+1}^{\ell + v}$, for any $\ell \le 2m - n - \log\frac{1}{\delta} - 2\log\frac{1}{\varepsilon}$. This keeps $\sigma$ the same, but shortens R
enough for the leftover hash lemma to work. The proof remains essentially the same, except that
to prove robustness, we will give the remaining bits $[ia + b]_{\ell + v + 1}^{n/2}$ for free to A.



3.2 Improving the construction of [DKRS06] When the Uniformity Constraint 
Dominates 

The construction of Dodis et al. [DKRS06] parses w as two strings a and b of lengths n - v and v,
respectively. The values $\sigma, R$ are computed as $\sigma = [ia]_1^v + b$ and $R = [ia]_{v+1}^{n-v}$; $P = (i, \sigma)$. In order to
get R to be uniform given P, the value v is increased until the leftover hash lemma can be applied
to $(R, \sigma)$. However, we observe that this unnecessarily increases the length of $\sigma$ (i.e., for every bit
added to v, two bits are subtracted from R). Instead, we propose to improve this construction
with essentially the same technique as we use for our construction in Section 3.1. The idea is to
simply shorten R without increasing the length of $\sigma$. This improvement applies to both pre- and
post-application robustness.

For post-application robustness, suppose the uniformity constraint dominates, i.e.,
$2\log\frac{1}{\varepsilon} > (2m - n + \log\frac{1}{\delta})/3$. Modify the construction of [DKRS06] by setting $v = (2n - m + \log\frac{1}{\delta})/3$ and
$R = [ia]_{v+1}^{n - v - \beta}$, where $\beta = 2\log\frac{1}{\varepsilon} - (2m - n - \log\frac{1}{\delta})/3$. This will result in an extracted key of
length $\ell = (4m - 2n - \log\frac{1}{\delta})/3 - 2\log\frac{1}{\varepsilon}$. However, even with the improvement, the extracted key
will always be shorter than the key extracted by our scheme, as explained in Section 4.2.

In contrast, this improvement seems useful in the case of pre-application robustness. Again,
suppose the uniformity constraint dominates, i.e., $2\log\frac{1}{\varepsilon} > \log\frac{1}{\delta}$. Modify the construction of [DKRS06]
by setting $v = n - m + \log\frac{1}{\delta}$ and $R = [ia]_{v+1}^{n - v - \beta}$, where $\beta = 2\log\frac{1}{\varepsilon} - \log\frac{1}{\delta}$. This will result in an
extracted key of length $\ell = 2m - n - 2\log\frac{1}{\varepsilon} - \log\frac{1}{\delta}$, which is $2\log\frac{1}{\varepsilon} - \log\frac{1}{\delta}$ longer than the key
extracted without this modification.



4 Comparison with the construction of [DKRS06]



4.1 When the Robustness Constraint Dominates 

Recall that the construction of Dodis et al. [DKRS06] parses w as two strings a and b of lengths
n - v and v, respectively. The values $\sigma, R$ are computed as $\sigma = [ia]_1^v + b$ and $R = [ia]_{v+1}^{n-v}$; $P = (i, \sigma)$.
Notice that, like in our construction, increasing v improves robustness and decreases the number
of extracted bits. For pre-application robustness, setting $v = n - m + \log\frac{1}{\delta}$ suffices, and thus the
construction extracts nearly (2m - n) bits. However, for post-application robustness, a much higher
v is needed, giving only around $\frac{1}{3}(2m - n)$ extracted bits.

The post-application robustness game reveals more information to A about w than the
pre-application robustness game. This additional information (namely, R) may make it easier for A
to guess $\sigma'$ for a well-chosen i'. The key to our improvement is in the pairwise independence of the
function ia + b that computes both $\sigma$ and R: because of pairwise independence, the value $(\sigma, R)$ of
the function on input i tells A nothing about the value $(\sigma', R')$ on another input i'. (This holds, of
course, for uniformly chosen key (a, b); when (a, b) has entropy m, then A can find out n - m bits
of information about $\sigma'$.)

In contrast, in the construction of [DKRS06], only $\sigma$ is computed using a pairwise independent
hash function. This works well (in fact, better than our construction, because b can be shorter)
for pre-application robustness, where A does not find out R. But it makes it possible for R to
decrease A's uncertainty about $\sigma'$ by as much as $\ell = |R|$, thus necessitating the length v of $\sigma'$ (and
hence $\sigma$) to be $v \ge \ell + (n - m)$ (the (n - m) term is the amount of entropy already potentially
"missing" from $\sigma'$ because of the nonuniformity of w). See Section 4.3 for a detailed description of
an adversarial strategy that utilizes R to obtain $\sigma'$ in the [DKRS06] construction.

Another way to see the differences between the two constructions is through the proof. In the
proof of post-application robustness, the transcript tr includes R, which makes for $2^\ell$ times more
transcripts than in the proof of pre-application robustness. However, the fact that this R imposes an
additional constraint on w, thus reducing the size of the set $\mathsf{Bad}_{tr}$, can compensate for this increase.



It turns out that for the construction of [DKRS06], this additional constraint can be redundant
if the adversary is clever about choosing i' and $\sigma'$, and the size of $\mathsf{Bad}_{tr}$ doesn't decrease. Using
a pairwise-independent function for computing R in our construction ensures that this additional
constraint decreases the size of $\mathsf{Bad}_{tr}$ by $2^\ell$. Thus, our construction achieves the same results for
pre- and post-application robustness.

4.2 When the Uniformity Constraint Dominates 

It should be noted that there may be reasonable cases when the uniformity constraint $\varepsilon$ on R
is strong enough that the construction of [DKRS06] extracts even fewer bits, because it needs
to take $v \ge n - m + 2\log\frac{1}{\varepsilon}$ to ensure near-uniformity of R given P. In that case, as long as
$m \ge n/2 + 2\log\frac{1}{\varepsilon}$, our construction will extract the same number of bits as before, thus giving it an
even bigger advantage. And when $m < n/2 + 2\log\frac{1}{\varepsilon}$, our construction still extracts at least 3/2 times
more bits than the construction of [DKRS06], even with the improvement of Section 3.2 applied
(this can be seen by algebraic manipulation of the relevant parameters for the post-application
robustness case).



4.3 Why the construction of [DKRS06] cannot extract more bits 



Recall that the robust fuzzy extractor of [DKRS06] operates as follows: parse w as two strings a, b
of lengths n - v, v respectively and compute $\sigma = [ia]_1^v + b$ and $R = [ia]_{v+1}^{n-v}$; $P = (i, \sigma)$.

For post-application robustness, the concern is that R can reveal information to the adversary
about $\sigma'$ for a cleverly chosen i'. Because the length of $\sigma'$ is v and $\ell + (n - m)$ bits of information
about $\sigma'$ may be available (the $\ell$ term comes from |R|, and the (n - m) term comes from the part
of w which has no entropy), this leads to the requirement that $v \ge \ell + n - m + \log\frac{1}{\delta}$ to make
sure the adversary has to guess at least $\log\frac{1}{\delta}$ bits about $\sigma'$. Plugging in $\ell = n - 2v$, we obtain
$\ell \le \frac{2}{3}(m - n/2 - \log\frac{1}{\delta})$, which is the amount extracted by the construction.
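The substitution in the last step can be written out as follows. This is a worked derivation from the two relations stated above, with the bound of Theorem 1 shown alongside for comparison.

```latex
\begin{align*}
\ell = n - 2v \;&\Longrightarrow\; v = \tfrac{n - \ell}{2}, \\
v \ge \ell + n - m + \log\tfrac{1}{\delta}
  \;&\Longrightarrow\; \tfrac{n - \ell}{2} \ge \ell + n - m + \log\tfrac{1}{\delta} \\
  \;&\Longrightarrow\; 3\ell \le 2m - n - 2\log\tfrac{1}{\delta} \\
  \;&\Longrightarrow\; \ell \le \tfrac{2}{3}\left(m - \tfrac{n}{2} - \log\tfrac{1}{\delta}\right), \\
\text{whereas Theorem 1 gives}\quad
  \ell \;&\le\; m - \tfrac{n}{2} - \log\tfrac{1}{\delta}.
\end{align*}
```

The ratio of the two bounds is 3/2, matching the improvement factor claimed in the introduction.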

Here we show an adversarial strategy that indeed utilizes R to obtain information about $\sigma'$ to
succeed with probability $\delta/2$. This demonstrates that the analysis in [DKRS06] is tight up to one
bit. To do so we have to fix a particular (and somewhat unusual) representation of field elements.
(Recall that any representation of field elements works for constructions here and in [DKRS06],
as long as addition of field elements corresponds to the exclusive-or of bit strings.) Typically,
one views $\mathbb{F}_{2^{n-v}}$ as $\mathbb{F}_2[x]/(p(x))$ for some irreducible polynomial p of degree n - v, and
represents elements as $\mathbb{F}_2$-valued vectors in the basis $(x^{n-v-1}, x^{n-v-2}, \ldots, x^2, x, 1)$. We will do the
same, but will reorder the basis elements so as to separate the even and the odd powers of x:
$(x^{n-v-1}, x^{n-v-3}, \ldots, x, x^{n-v-2}, x^{n-v-4}, \ldots, 1)$ (assuming, for concreteness, that n - v is even). The
advantage of this representation for us is that the top half of bits of some value $z \in \mathbb{F}_{2^{n-v}}$ is equal
to the bottom half of the bits of z/x, as long as the last bit of z is 0.
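The claimed property of the reordered basis can be verified exhaustively for a small degree. The sketch below assumes n - v = 8 and stores a field element as an integer whose bit j is the coefficient of x^j; when the constant term is 0, dividing by x is a plain right shift, so no reduction polynomial is needed. The names and the choice of degree are illustrative assumptions.

```python
D = 8  # n - v, assumed even for concreteness

def reorder(z):
    """Coordinates of z in the basis (x^7, x^5, x^3, x, x^6, x^4, x^2, 1).

    Returns (top_half, bottom_half): coefficients of the odd powers,
    then coefficients of the even powers, each in decreasing degree.
    """
    top = [(z >> j) & 1 for j in range(D - 1, 0, -2)]      # x^7, x^5, x^3, x^1
    bottom = [(z >> j) & 1 for j in range(D - 2, -1, -2)]  # x^6, x^4, x^2, x^0
    return top, bottom

def div_by_x(z):
    """z / x for z with constant term 0: z = x * q as polynomials, so q = z >> 1."""
    assert z & 1 == 0
    return z >> 1

def check_property():
    """Top half of z equals bottom half of z/x for every z whose last bit is 0."""
    return all(reorder(z)[0] == reorder(div_by_x(z))[1] for z in range(0, 1 << D, 2))

print(check_property())  # True
```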

Now suppose the distribution on w is such that the top n - m bits of b are 0 (the rest of the
bits of w are uniform). Then by receiving $\sigma$ and R, the adversary gets to see the top $\ell + (n - m)$
bits of ia. Therefore, the adversary knows $\ell + (n - m)$ bits from the bottom half of ia/x as long as
the last bit of ia is 0, which happens with probability 1/2. To use this knowledge, the adversary
will simply ensure that the difference between $\sigma'$ and $\sigma$ is $[ia/x]_1^v$, by letting i' = i + i/x.

Thus, the adversarial strategy is as follows: let i' = i + i/x; let r consist of the $\ell$ bits of R,
the top n - m bits of $\sigma$, and $\log\frac{1}{\delta} = v - \ell - (n - m)$ randomly guessed bits, and let $\sigma' = \sigma + r$.
The adversary wins whenever $r = [ia/x]_1^v$, which happens with probability $2^{-(v - \ell - (n - m))}/2 = \delta/2$,
because all but $\log\frac{1}{\delta}$ bits of r are definitely correct as long as the last bit of ia is 0.
The above discussion gives us the following result.
The above discussion gives us the following result. 

Theorem 2. There exists a basis for $GF(2^{n-v})$ such that for any integer m there exists a
distribution W of min-entropy m for which the post-application robustness of the construction from
[DKRS06, Theorem 3] can be violated with probability at least $\delta/2$, where v is set as required for
robustness $\delta$ by the construction (i.e., $v = (n - \ell)/2$ for $\ell = (2m - n - 2\log\frac{1}{\delta})/3$).

Note that our lower bound uses a specific representation of field elements, and hence does not
rule out that for some particular representation of field elements, a lower value of v and, therefore,
a higher value of $\ell$ is possible. However, a security proof for a lower value of v would have to then
depend on the properties of that particular representation and would not cover the construction
of [DKRS06] in general.

5 Tolerating Binary Hamming Errors 

We now consider the scenario where Bob has a string w' that is close to Alice's input w (in the 
Hamming metric). In order for them to agree on a random string, Bob would first have recover w 
from w' . To this end, Alice could send the secure sketch s = SS(u;) to Bob along with (i,o~). To 
prevent an undetected modification of s to s', she could send an authentication of s (using w as the 
key) as well. The nontriviality of making such an extension work arises from the fact that modifying 
s to s' also gives the adversary the power to influence Bob's verification key w* = SRec(w' , s'). The 
adversary could perhaps exploit this circularity to succeed in an active attack (the definition of 
standard authentication schemes only guarantee security when the keys used for authentication 
and verification are the same). 

We break this circularity by exploiting the algebraic properties of the Hamming metric space,
and using authentication secure against algebraic manipulation [DKRS06, CDF+08]. The techniques
that we use are essentially the same as those used in [DKRS06], but adapted to our construction.
We present the construction here and then discuss the exact properties that we use in the proof of
security.

Construction. Let $\mathcal{M}$ be the Hamming metric space on $\{0,1\}^n$. Let $W$ be a distribution of
min-entropy $m$ over $\mathcal{M}$. Let $s = SS(w)$ be a deterministic, linear secure sketch; let $|s| = k$, $n' = n - k$.
Assume that SS is a surjective linear function (which is the case for the syndrome construction for
the Hamming metric mentioned in Section 2). Therefore, there exists a $k \times n$ matrix $S$ of rank $k$
such that $SS(w) = Sw$. Let $S^\perp$ be an $n' \times n$ matrix such that the $n \times n$ matrix
$\begin{pmatrix} S \\ S^\perp \end{pmatrix}$ has full rank. We let $SS^\perp(w) = S^\perp w$.

To compute Gen($w$), let $s = SS(w)$, $c = SS^\perp(w)$; $|c| = n'$. We assume that $n'$ is even (if
not, drop one bit of $c$, reducing its entropy by at most 1). Let $a$ be the first half of $c$ and $b$
the second. View $a, b$ as elements of $\mathbb{F}_{2^{n'/2}}$. Let $L = 2\lceil k/n' \rceil$ (it will be important for security that
$L$ is even). Pad $s$ with 0s to length $Ln'/2$, and then split it into $L$ bit strings $s_{L-1}, \ldots, s_0$ of
length $n'/2$ bits each, viewing each bit string as an element of $\mathbb{F}_{2^{n'/2}}$. Select $i \leftarrow \mathbb{F}_{2^{n'/2}}$. Define
$f_{s,i}(x) = x^{L+3} + x^2(s_{L-1}x^{L-1} + s_{L-2}x^{L-2} + \cdots + s_0) + ix$. Set $\sigma = [f_{s,i}(a) + b]_1^v$, and output

$P = (s, i, \sigma)$ and $R = [f_{s,i}(a) + b]_{v+1}^{n'/2}$.
Gen($w$):

1. Set $s = SS(w)$, $c = SS^\perp(w)$, $k = |s|$, $n' = |c|$.

- Let $a = [c]_1^{n'/2}$, $b = [c]_{n'/2+1}^{n'}$.

- Let $L = 2\lceil k/n' \rceil$. Pad $s$ with 0s to length $Ln'/2$.

- Parse the padded $s$ as $s_{L-1} \| s_{L-2} \| \ldots \| s_0$ for $s_j \in \mathbb{F}_{2^{n'/2}}$.

2. Select $i \leftarrow \mathbb{F}_{2^{n'/2}}$.

3. Set $\sigma = [f_{s,i}(a) + b]_1^v$, and output $R = [f_{s,i}(a) + b]_{v+1}^{n'/2}$ and $P = (s, i, \sigma)$.

Rep($w'$, $P' = (s', i', \sigma')$):

1. Compute $w^* = SRec(w', s')$.

- Verify that $dis(w^*, w') \le t$ and $SS(w^*) = s'$. If not, output $\perp$.

2. Let $c' = SS^\perp(w^*)$. Parse $c'$ as $a' \| b'$.

3. Compute $\sigma^* = [f_{s',i'}(a') + b']_1^v$.

- Verify that $\sigma^* = \sigma'$. If so, output $R = [f_{s',i'}(a') + b']_{v+1}^{n'/2}$, else output $\perp$.
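The two procedures can be traced on a toy instance. The Python sketch below is only an illustration under assumptions that are not part of the construction's statement: $n = 7$, the syndrome of the [7,4] Hamming code as the secure sketch ($k = 3$, $t = 1$), a coordinate-selecting matrix standing in for $S^\perp$, the field GF(4) with modulus $z^2 + z + 1$, $v = 1$, and a particular bit ordering. All helper names are hypothetical, and a 1-bit key offers no security.

```python
import math
import secrets

N, K, T = 7, 3, 1                  # toy sizes (assumed): n = 7, k = 3, t = 1
HALF = (N - K) // 2                # n'/2 = 2, so a, b, i live in GF(2^2)
V = 1                              # tag length v; R gets the other HALF - V bits
L = 2 * math.ceil(K / (N - K))     # L = 2*ceil(k/n') = 2
FREE = (3, 5, 6, 7)                # 1-indexed coordinates read by the toy S-perp

def gf_mul(x, y):
    """Multiply in GF(4) = GF(2)[z]/(z^2 + z + 1); elements are 2-bit ints."""
    r = 0
    for _ in range(HALF):
        if y & 1:
            r ^= x
        y >>= 1
        x <<= 1
        if x & (1 << HALF):
            x ^= 0b111
    return r

def gf_pow(x, e):
    r = 1
    for _ in range(e):
        r = gf_mul(r, x)
    return r

def ss(w):
    """Syndrome s = S w, where column j of S is j written in binary."""
    s = 0
    for j, bit in enumerate(w, start=1):
        if bit:
            s ^= j
    return s

def ss_perp(w):
    """c = S_perp w, returned as its two halves (a, b)."""
    c = [w[j - 1] for j in FREE]
    return c[0] << 1 | c[1], c[2] << 1 | c[3]

def srec(w1, s1):
    """Flip the coordinate named by the syndrome difference (at most one error)."""
    j = ss(w1) ^ s1
    w_star = list(w1)
    if j:
        w_star[j - 1] ^= 1
    return w_star

def f(s, i, x):
    """f_{s,i}(x) = x^(L+3) + x^2 (s_{L-1} x^(L-1) + ... + s_0) + i x."""
    padded = s << (L * HALF - K)   # pad s with zeros to L*n'/2 bits
    inner = 0
    for j in range(L):             # s_j is the j-th HALF-bit block from the right
        s_j = (padded >> (j * HALF)) & ((1 << HALF) - 1)
        inner ^= gf_mul(s_j, gf_pow(x, j))
    return gf_pow(x, L + 3) ^ gf_mul(gf_pow(x, 2), inner) ^ gf_mul(i, x)

def split(y):
    """Return (first v bits, remaining bits) of y."""
    return y >> (HALF - V), y & ((1 << (HALF - V)) - 1)

def gen(w):
    s = ss(w)
    a, b = ss_perp(w)
    i = secrets.randbelow(1 << HALF)
    sigma, r = split(f(s, i, a) ^ b)
    return r, (s, i, sigma)

def rep(w1, p1):
    s1, i1, sigma1 = p1
    w_star = srec(w1, s1)
    dist = sum(x != y for x, y in zip(w_star, w1))
    if dist > T or ss(w_star) != s1:
        return None                # None plays the role of the reject symbol
    a1, b1 = ss_perp(w_star)
    sigma_star, r = split(f(s1, i1, a1) ^ b1)
    return r if sigma_star == sigma1 else None
```

With an unmodified $P$, `rep` applied to a reading that differs from `w` in one position returns the same `r` that `gen` produced; flipping the tag bit in $P$ makes it return `None`.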

In the theorem statement below, let $B$ denote the volume of a Hamming ball of radius $t$ in
$\{0,1\}^n$ ($\log B \le nH_2(t/n)$ [MS77, Chapter 10, §11, Lemma 8] and $\log B \le t\log(n + 1)$ [DKRS06]).
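The two bounds on $\log B$ are easy to compare numerically. The Python sketch below computes $\log B$ exactly and evaluates both upper bounds; the function names and the sample parameters ($n = 127$, $t = 10$) are illustrative choices, and the entropy bound is used only for $t \le n/2$.

```python
from math import comb, log2

def log2_ball_volume(n, t):
    """log2 of the number of n-bit strings within Hamming distance t of a point."""
    return log2(sum(comb(n, j) for j in range(t + 1)))

def binary_entropy(p):
    """H_2(p) in bits, with H_2(0) = H_2(1) = 0."""
    if p in (0, 1):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def ball_volume_bounds(n, t):
    """Return (exact log B, entropy bound n*H_2(t/n), bound t*log(n+1))."""
    return log2_ball_volume(n, t), n * binary_entropy(t / n), t * log2(n + 1)

exact, entropy_bound, simple_bound = ball_volume_bounds(127, 10)
print(exact, entropy_bound, simple_bound)
```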

Theorem 3. Assume SS is a deterministic linear $(m, m - k, t)$-secure sketch of output length
$k$ for the Hamming metric on $\{0,1\}^n$. Setting $v = (n - k)/2 - \ell$, the above construction is an
$(m, \ell, t, \varepsilon)$ fuzzy extractor with robustness $\delta$ for any $m, \ell, t, \varepsilon$ satisfying

\[ \ell \le m - n/2 - k - \log B - \log\left(2\left\lceil \frac{k}{n-k} \right\rceil + 2\right) - \log\frac{1}{\delta} \]

as long as $m \ge \frac{1}{2}(n + k) + 2\log\frac{1}{\varepsilon}$.

Again, if $m < \frac{1}{2}(n + k) + 2\log\frac{1}{\varepsilon}$, the construction can be modified, as shown in Section 5.1.

Proof. Extraction. Our goal is to show that $R$ is nearly uniform given $P = (s, i, \sigma)$. To do so,
we first note that for every $s$, the function $h_i(c) = (\sigma, R)$ is a universal hash family. Indeed, for
$c \ne c'$ there is a unique $i$ such that $h_i(c) = h_i(c')$ (since $i(a - a')$ is fixed, like in the errorless case).
We also note that $\tilde{H}_\infty(c \mid SS(W)) \ge H_\infty(c, SS(W)) - k = H_\infty(W) - k = m - k$ by Lemma 2.
Because $|(R, \sigma)| = n'/2$, Lemma 3 (or, more precisely, its generalization mentioned in the paragraph
following the lemma, needed here because $h_i$ depends on $s$) gives us

\[ SD\left((R, P),\; U_{|R|} \times SS(W) \times U_{n'/2} \times U_v\right) \le \varepsilon/2 \]

for $n'/2 \le m - k + 2 - 2\log(2/\varepsilon)$. This is equivalent to saying that $(R, P)$ is $2^{(n'/2 - m + k)/2 - 1}$-close
to $U_{|R|} \times SS(W) \times U_{n'/2} \times U_v$.
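The universal-hash step in the argument above can be checked exhaustively on a toy field. The Python sketch below assumes GF(4) with modulus $z^2 + z + 1$ and $L = 2$ (both illustrative choices, as are the function names), and counts, for every pair of distinct inputs $c = a\|b$ and $c' = a'\|b'$, how many seeds $i$ make $h_i(c) = h_i(c')$. A universal family with 4 seeds and 4 outputs allows at most one colliding seed per pair.

```python
from itertools import product

def gf4_mul(x, y):
    """Multiply in GF(4) = GF(2)[z]/(z^2 + z + 1); elements are ints 0..3."""
    r = 0
    for _ in range(2):
        if y & 1:
            r ^= x
        y >>= 1
        x <<= 1
        if x & 0b100:
            x ^= 0b111
    return r

def gf4_pow(x, e):
    r = 1
    for _ in range(e):
        r = gf4_mul(r, x)
    return r

def h(s_blocks, i, a, b):
    """h_i(c) = f_{s,i}(a) + b for c = a||b, with s_blocks = (s_0, ..., s_{L-1})."""
    L = len(s_blocks)
    inner = 0
    for j, s_j in enumerate(s_blocks):
        inner ^= gf4_mul(s_j, gf4_pow(a, j))
    f = gf4_pow(a, L + 3) ^ gf4_mul(gf4_pow(a, 2), inner) ^ gf4_mul(i, a)
    return f ^ b

def max_colliding_seeds(s_blocks):
    """Largest number of seeds i on which two distinct inputs collide."""
    inputs = list(product(range(4), repeat=2))      # all c = (a, b)
    worst = 0
    for c, d in product(inputs, repeat=2):
        if c != d:
            hits = sum(h(s_blocks, i, *c) == h(s_blocks, i, *d) for i in range(4))
            worst = max(worst, hits)
    return worst
```

In this toy setting a pair with $a \ne a'$ collides on exactly one seed, while a pair with $a = a'$ and $b \ne b'$ collides on none.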

Applying Lemma 1 to $A = R$, $B = P$, $C = U_{n'/2 - v}$, $D = SS(W) \times U_{n'/2} \times U_v$, we get that $(R, P)$
is $\varepsilon$-close to $U_{n'/2 - v} \times P$, for $\varepsilon = 2^{(n'/2 - m + k)/2}$.

From here it follows that for extraction to be possible, $m \ge \frac{1}{2}(n + k) + 2\log\frac{1}{\varepsilon}$.

Post-Application Robustness. In the post-application robustness security game, the adversary
$\mathcal{A}$, on receiving $(P = (s, i, \sigma), R)$ (generated according to procedure Gen), outputs $P' = (s', i', \sigma')$,
and is considered successful if $(P' \ne P) \wedge Rep(w', P') \ne \perp$. In our analysis, we will assume that
$(i', s') \ne (i, s)$. We claim that this does not reduce $\mathcal{A}$'s success probability. Indeed, if $(i', s') = (i, s)$
then $c'$ computed within Rep will equal $c$. So, for $P' \ne P$ to hold, $\mathcal{A}$ would have to output $\sigma' \ne \sigma$.
However, when $(i', c', s') = (i, c, s)$, Rep would compute $\sigma^* = \sigma$, and therefore would output $\perp$
unless $\sigma' = \sigma$.

In our analysis, we allow A to be deterministic. This is without loss of generality since we allow 
an unbounded adversary. We also allow A to arbitrarily fix i. This makes the result only stronger 
since we demonstrate robustness for a worst-case choice of i. 

Since $i$ is fixed and $\mathcal{A}$ is deterministic, the transcript $tr = (i, s, \sigma, R, i', s', \sigma')$ is determined completely by
$(s, \sigma, R)$. Recall that the prime challenge in constructing a robust fuzzy extractor was that $\mathcal{A}$ could
somehow relate the key used by Rep to verify $\sigma'$ to the authentication key that was used by Gen to
come up with $\sigma$. As was done in [DKRS06], we will argue security of our construction by showing
that the MAC scheme implicitly used in our construction remains unforgeable even when $\mathcal{A}$ could
force the verification key to be at an offset (of her choice) from the authentication key. We will
formalize such an argument by assuming that $\mathcal{A}$ learns $\Delta = w' - w$. Recall that $w^* = SRec(w', s')$
and $c' = a' \| b' = SS^\perp(w^*)$. The following claim, which was proven in [DKRS06], states that given
$(\Delta, s)$, $\mathcal{A}$ can compute the offsets $\Delta_a = a' - a$, $\Delta_b = b' - b$ induced by her choice of $s'$.

Claim 1. Given $\Delta = w' - w$ and the sketches $s, s'$, $\mathcal{A}$ can compute $\Delta_a = a' - a$ and $\Delta_b = b' - b$,
or determine that Rep will reject before computing $a', b'$.

In other words, she can compute the offset between the authentication key that Gen used to
come up with $\sigma$ and the verification key that Rep will use to verify $\sigma'$. We will now argue that as
long as $W$ has sufficient min-entropy, even knowing the offset does not help $\mathcal{A}$ succeed in an active
attack. Recall that since $i$ is arbitrarily fixed by $\mathcal{A}$, $\mathcal{A}$'s success depends on $w, w'$, or, alternatively,
on $w, \Delta$. Fix some $\Delta$. For any particular $tr$, let $Succ_{tr,\Delta}$ be the event that the transcript is $tr$ and
$\mathcal{A}$ wins, i.e., that $f_{s,i}(a) + b = \sigma \| R \wedge [f_{s',i'}(a') + b']_1^v = \sigma' \wedge SS(w) = s$, conditioned on the fact that
$w' - w$ is $\Delta$. We denote by $Bad_{tr,\Delta}$ the set of $w$ that make $Succ_{tr,\Delta}$ true. We now partition the set
$Bad_{tr,\Delta}$ into $2^\ell$ disjoint sets, indexed by $R' \in \{0,1\}^\ell$:

\[ Bad^{R'}_{tr,\Delta} = \{ w \mid w \in Bad_{tr,\Delta} \wedge [f_{s',i'}(a') + b']_{v+1}^{n'/2} = R' \} \]

\[ = \{ w \mid (f_{s,i}(a) + b = \sigma \| R) \wedge (f_{s',i'}(a') + b' = \sigma' \| R') \wedge SS(w) = s \}. \]

By Claim 1, fixing $(tr, \Delta)$ also fixes $\Delta_a, \Delta_b$. It follows that every $w \in Bad^{R'}_{tr,\Delta}$ needs to satisfy

\[ f_{s,i}(a) - f_{s',i'}(a + \Delta_a) = (\Delta_b + \sigma - \sigma') \| (R - R') \;\wedge\; SS(w) = s. \]

For a given $tr, \Delta, R'$, the right hand side of the first equation takes a fixed value. Let us now focus
on the polynomial $f_{s,i}(a) - f_{s',i'}(a + \Delta_a)$. We will consider two cases:

• $\Delta_a = 0$: In this case, $f_{s,i}(x) - f_{s',i'}(x)$ is a polynomial in which a coefficient of degree 2 or
higher is nonzero if $s \ne s'$ and a coefficient of degree 1 or higher is nonzero if $i \ne i'$.

• $\Delta_a \ne 0$: Observe that the leading term of the polynomial is $((L + 3) \bmod 2)\Delta_a x^{L+2}$. Since we
forced $L$ to be even, the coefficient of the leading term is nonzero, making $f_{s,i}(x) - f_{s',i'}(x + \Delta_a)$
a polynomial of degree $L + 2$.
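The two cases above can be reproduced symbolically on a small field. The Python sketch below assumes GF(16) with modulus $z^4 + z + 1$ and arbitrary sample values for $s, s', i, i', \Delta_a$ (the field, the values, and the helper names are all illustrative). It expands $f_{s,i}(x) - f_{s',i'}(x + \Delta_a)$ into coefficients, reports its degree, and counts how often the polynomial hits any single value. Subtraction equals addition in characteristic 2, so XOR is used for both.

```python
from itertools import zip_longest

def gf_mul(x, y, m=4, modulus=0b10011):
    """Multiply in GF(2^m); the defaults give GF(16) = GF(2)[z]/(z^4 + z + 1)."""
    r = 0
    for _ in range(m):
        if y & 1:
            r ^= x
        y >>= 1
        x <<= 1
        if x >> m:
            x ^= modulus
    return r

def poly_add(p, q):
    """Add polynomials over GF(16); p[d] is the coefficient of x^d."""
    return [a ^ b for a, b in zip_longest(p, q, fillvalue=0)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for d1, c1 in enumerate(p):
        for d2, c2 in enumerate(q):
            out[d1 + d2] ^= gf_mul(c1, c2)
    return out

def poly_pow(p, e):
    out = [1]
    for _ in range(e):
        out = poly_mul(out, p)
    return out

def f_poly(s_blocks, i, shift):
    """Coefficients of f_{s,i}(x + shift), with s_blocks = [s_0, ..., s_{L-1}]."""
    L = len(s_blocks)
    y = [shift, 1]                                  # the polynomial x + shift
    inner = [0]
    for j, s_j in enumerate(s_blocks):
        inner = poly_add(inner, poly_mul([s_j], poly_pow(y, j)))
    top = poly_add(poly_pow(y, L + 3), poly_mul(poly_pow(y, 2), inner))
    return poly_add(top, poly_mul([i], y))

def difference(s, i, s1, i1, delta_a):
    """f_{s,i}(x) - f_{s1,i1}(x + delta_a) as a coefficient list."""
    return poly_add(f_poly(s, i, 0), f_poly(s1, i1, delta_a))

def degree(p):
    return max((d for d, c in enumerate(p) if c), default=-1)

def poly_eval(p, x):
    acc = 0
    for c in reversed(p):                           # Horner's rule
        acc = gf_mul(acc, x) ^ c
    return acc

def max_preimages(p):
    """Largest number of field elements that p maps to one common value."""
    return max(sum(poly_eval(p, x) == y for x in range(16)) for y in range(16))

even_case = difference([3, 7], 5, [9, 1], 2, 6)        # L = 2, nonzero offset
odd_case = difference([3, 7, 2], 5, [9, 1, 4], 2, 6)   # L = 3, same offset
print(degree(even_case), degree(odd_case), max_preimages(even_case))
```

For even $L$ the coefficient of $x^{L+2}$ is $\Delta_a$ itself, while for odd $L$ that coefficient vanishes and the degree drops to at most $L + 1$, which is why the construction forces $L$ to be even.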

Therefore, in either case, $f_{s,i}(x) - f_{s',i'}(x + \Delta_a)$ is a nonconstant polynomial of degree at most
$L + 2$. A nonconstant polynomial of degree $d$ can take on a fixed value at most $d$ times. It,
therefore, follows that there are at most $L + 2$ values of $a$ such that $f_{s,i}(a) - f_{s',i'}(a + \Delta_a) =
(\Delta_b + \sigma - \sigma') \| (R - R')$. Each such $a$ uniquely determines $b = (\sigma \| R) - f_{s,i}(a)$. And $w$ is uniquely
determined by $c = a \| b = SS^\perp(w)$ and $s = SS(w)$. Therefore, there are at most $L + 2$ values of
$w$ in the set $Bad^{R'}_{tr,\Delta}$, i.e., $|Bad^{R'}_{tr,\Delta}| \le L + 2$. Since $Bad_{tr,\Delta} = \bigcup_{R' \in \{0,1\}^\ell} Bad^{R'}_{tr,\Delta}$, we get $|Bad_{tr,\Delta}| \le
(L + 2)2^\ell = (L + 2)2^{n'/2 - v}$. Thus, $\Pr_w[Succ_{tr,\Delta}] \le |Bad_{tr,\Delta}| \, 2^{-H_\infty(w \mid \Delta)} \le (L + 2)2^{n'/2 - v - H_\infty(w \mid \Delta)}$.

To find out the probability $\Pr_w[Succ_\Delta]$ that $\mathcal{A}$ succeeds conditioned on a particular $\Delta$, we need
to add up $\Pr_w[Succ_{tr,\Delta}]$ over all possible transcripts. Recalling that each transcript is determined
by $\sigma$, $R$ and $s$, and hence there are $2^{n'/2 + k}$ of them, and that $n' + k = n$, we get $\Pr_w[Succ_\Delta] \le
(L + 2)2^{n - v - H_\infty(w \mid \Delta)}$.

Finally, the probability of adversarial success is at most

\[ \mathbb{E}_{\Delta} \Pr_w[Succ_\Delta] \le (L + 2)2^{n - v - \tilde{H}_\infty(w \mid \Delta)}. \]

In particular, if the errors $\Delta$ are independent of $w$, then $\tilde{H}_\infty(w \mid \Delta) = H_\infty(w) = m$, and the
probability of adversarial success is at most $(L + 2)2^{n - v - m}$. In the worst case, however, the
entropy of $w$ may decrease at most by the number of bits needed to represent $\Delta$. Let $B$ be the
volume of the Hamming ball of radius $t$ in $\{0,1\}^n$. Then, $\Delta$ can be represented in $\log B$ bits and
$\tilde{H}_\infty(w \mid \Delta) \ge m - \log B$, by Lemma 2. From here it follows that

\[ \Pr[\mathcal{A}\text{'s success}] \le B(L + 2)2^{n - v - m}. \]

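To achieve $\delta$-robustness, we want $B(L + 2)2^{n - v - m} \le \delta$, i.e., $v \ge n - m + \log B + \log(L + 2) + \log\frac{1}{\delta}$.
Setting $v = n - m + \log B + \log(L + 2) + \log\frac{1}{\delta}$, and using $L = 2\lceil \frac{k}{n-k} \rceil$, it follows that

\[ \ell \le m - n/2 - k - \log B - \log\left(2\left\lceil \frac{k}{n-k} \right\rceil + 2\right) - \log\frac{1}{\delta}. \]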



□ 
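As a numeric illustration of the robustness condition derived in the proof, the Python sketch below computes the smallest integer $v$ with $B(L + 2)2^{n - v - m} \le \delta$ and evaluates the resulting bound in the log domain. The function names and the sample parameters ($n = 1000$, $m = 900$, $k = 100$, $t = 10$, $\delta = 2^{-64}$) are hypothetical and chosen only to make the arithmetic concrete.

```python
from math import ceil, comb, log2

def log2_ball_volume(n, t):
    """log2 of the volume B of a Hamming ball of radius t in {0,1}^n."""
    return log2(sum(comb(n, j) for j in range(t + 1)))

def required_tag_length(n, m, k, t, delta):
    """Smallest integer v with B (L+2) 2^(n-v-m) <= delta, L = 2*ceil(k/(n-k))."""
    L = 2 * ceil(k / (n - k))
    return ceil(n - m + log2_ball_volume(n, t) + log2(L + 2) + log2(1 / delta))

def log2_success_bound(n, m, k, t, v):
    """log2 of the bound B (L+2) 2^(n-v-m) on the adversary's success probability."""
    L = 2 * ceil(k / (n - k))
    return log2_ball_volume(n, t) + log2(L + 2) + (n - v - m)

n, m, k, t, delta = 1000, 900, 100, 10, 2.0 ** -64
v = required_tag_length(n, m, k, t, delta)
print(v, log2_success_bound(n, m, k, t, v))
```

Rounding $v$ up to an integer can only lower the bound, so the printed value is at most $\log \delta = -64$ for these parameters.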



5.1 Getting Closer to Uniform 

If $\varepsilon$ is so low that $m \ge \frac{1}{2}(n + k) + 2\log\frac{1}{\varepsilon}$ does not hold, we can modify our construction just
as we did in Section 3.1, by shortening $R$ by $\beta = \frac{1}{2}(n + k) + 2\log\frac{1}{\varepsilon} - m$. That is, keep $v =
n - m + \log B + \log(L + 2) + \log\frac{1}{\delta}$ fixed and let $R = [f_{s,i}(a) + b]_{v+1}^{v+\ell}$, where $\ell \le n/2 - v - \beta$.